ABSTRACT

We consider inference procedures, conditional on an observed ancillary statis- 
tic, for regression coefficients under a linear regression setup where the un- 
known error distribution is specified nonparametrically. We establish condi- 
tional asymptotic normality of the regression coefficient estimators under reg- 
ularity conditions, and formally justify the approach of plugging in kernel-type 
density estimators in conditional inference procedures. Simulation results show 
that the approach yields accurate conditional coverage probabilities when used 
for constructing confidence intervals. The plug-in approach can be applied in 
conjunction with configural polysampling to derive robust conditional estima- 
tors adaptive to a confrontation of contrasting scenarios. We demonstrate this 
by investigating the conditional mean squared error of location estimators un- 
der various confrontations in a simulation study, which successfully extends 
configural polysampling to a nonparametric context. 

Key words and phrases: ancillary; bandwidth; conditional inference; config- 
ural polysampling; confrontation; plug-in. 



Kong Special Administrative Region, China (Project No. HKU 7104/OlP). 



1 Introduction 



The classical conditionality principle (Fisher (1934, 1935); Cox and Hinkley 
(1974)) demands that statistical inference be made relevant to the data at hand 
by conditioning on ancillary statistics. Arguments for this are best seen from 
examples in Cox and Hinkley ((1974), Ch.2). Further discussion can be found 
in Barndorff-Nielsen (1978) and in Lehmann (1981). Under regression mod- 
els, the ancillary statistic takes the form of studentized residuals. Conditional 
inference about regression coefficients has been discussed by Fraser (1979), 
Hinkley (1978), DiCiccio (1988), DiCiccio, Field and Fraser (1990), and Sev- 
erini (1996), among others. When the error density is completely specified, 
approximate conditional inference can be made by Monte Carlo simulation 
or by using numerical integration techniques. The procedure nevertheless be- 
comes computationally intensive if the parameter has a high dimension, in 
which case large-sample approximations such as those proposed by DiCiccio 
(1988) and DiCiccio, Field and Fraser (1990) may be necessary. In a nonpara- 
metric context where the error density is unspecified, conditional inference has 
not received much attention despite its clear practical relevance. Fraser (1976) 
and Severini (1994) tackle the special case of location models. Both suggest 
plugging in kernel density estimates but provide no theoretical justification for 
the approach nor any formal suggestion on the choice of bandwidth. The need 
for sophisticated Monte Carlo or numerical integration techniques endures, and
the computational cost is even more expensive than that required by the para- 
metric case. Details of the computational procedures can be found in Severini 
(1994) and Seifu, Severini and Tanner (1999). In the present paper we prove 
asymptotic consistency, conditional on the ancillary statistic, of plugging in 
the kernel density estimator, and derive the orders of bandwidths sufficient 
for ensuring such consistency. Our proof also suggests a normal approxima- 
tion to the plug-in approach which is computationally much more efficient for 
high-dimensional regression estimators.

Consideration of conditionality has motivated different notions of robust-
ness for regression models: see Fraser (1979), Barnard (1981, 1983), Hinkley
(1983) and Severini (1992, 1996). Morgenthaler and Tukey (1991) propose a 
configural polysampling technique for robust conditional inference, which com- 
promises results obtained separately from a confrontation of contrasting error 
distributions and provides a global perspective for robustness. Our plug-in 
approach extends configural polysampling to a nonparametric context, sub- 
stantially broadens the scope of confrontation, and enhances the global nature 
of the robustness attributed to the resulting inference procedure. 

Section 2.1 describes the problem setting. Section 2.2 reviews a boot- 
strap approach to unconditional inference for regression coefficients. The case 
of conditional inference is treated in Section 2.3. Section 3 investigates the 
asymptotics underlying the plug-in approach. Section 4 reviews configural
polysampling and extends it to nonparametric confrontations by the plug-in 
approach. Empirical results are given in Section 5. Section 6 concludes our 
findings. All proofs are given in the Appendix. 

2 Inference for regression coefficients 
2.1 Problem setting 

Consider a linear regression model $Y_i = x_i^T\beta + \epsilon_i$, for $i = 1, \ldots, n$, where
$x_i = (x_{i1}, \ldots, x_{ip})^T$ is the vector of covariates, $\beta = (\beta_1, \ldots, \beta_p)^T$ is the
vector of unknown regression coefficients, and the random errors $\epsilon_1, \ldots, \epsilon_n$ are
independent and identically distributed with density $f$ symmetric about 0.
Write $Y = (Y_1, \ldots, Y_n)^T$, $X = [x_1, \ldots, x_n]^T$ and $\epsilon = (\epsilon_1, \ldots, \epsilon_n)^T$.
Introduction of a scale parameter leads to a regression-scale model under which
$f(u) = f_0(u/\sigma)/\sigma$ for an unknown scale $\sigma > 0$, and a density $f_0$ with unit
scale. In this case we have $\epsilon = \sigma e = \sigma(e_1, \ldots, e_n)^T$, for independent $e_1, \ldots, e_n$
distributed with density $f_0$, so that $Y = X\beta + \sigma e$. Throughout the paper
we treat $\beta$ as the parameter of interest and $f$, or equivalently, $(\sigma, f_0)$, as the
nuisance parameter of possibly infinite dimension.

Let $\hat\beta = \hat\beta(Y)$ be a location and scale equivariant estimator of $\beta$ and,
under the regression-scale model, $\hat\sigma = \hat\sigma(Y)$ be a location invariant and scale
equivariant estimator of $\sigma$, so that $\hat\beta(Xc + dy) = c + d\hat\beta(y)$ and
$\hat\sigma(Xc + dy) = |d|\hat\sigma(y)$ for any $(d, c, y) \in \mathbb{R} \times \mathbb{R}^p \times \mathbb{R}^n$. For example, $\hat\beta$ may be the
least squares estimator and $\hat\sigma^2$ the mean squared residuals. Define, for
$i = 1, \ldots, n$, $\tilde A_i = Y_i - x_i^T\hat\beta$ and $A_i = \tilde A_i/\hat\sigma$. We can easily show that
$\tilde A = (\tilde A_1, \ldots, \tilde A_n)^T$ and $A = (A_1, \ldots, A_n)^T$ provide ancillary statistics under the
regression model with known $f$ and the regression-scale model with known $f_0$,
respectively. When $f_0$, and hence $f$, is unspecified, exact conditional inference
is not possible as the conditional likelihood of $\beta$ depends in general on $f_0$.
Adopting Jørgensen's (1993) notion of I-sufficiency, we see that $A$ is I-sufficient
for $f_0$, so that any relevant information about $f_0$ is contained in $A$. The
same applies to $\tilde A$ and $f$. Such ancillary-informed knowledge about $f$ and $f_0$
forms the basis for nonparametric estimation of the conditional likelihood and
facilitates nonparametric conditional inference in an approximate sense.
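
The invariance that makes the studentized residuals ancillary can be checked numerically. The following is a minimal sketch, assuming least squares for the regression estimator and the root mean squared residual for the scale estimator (the example given in the text); the function name, the simulated design and the constants c and d are illustrative only. For d > 0 the studentized residuals computed from y and from Xc + dy should coincide up to rounding.

```python
import numpy as np

def studentized_residuals(X, y):
    """Least squares fit; returns (beta_hat, sigma_hat, A) with A = residuals / sigma_hat."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    raw = y - X @ beta_hat
    sigma_hat = np.sqrt(np.mean(raw ** 2))  # root mean squared residual
    return beta_hat, sigma_hat, raw / sigma_hat

# Illustrative data: intercept plus one covariate, heavy-tailed errors.
rng = np.random.default_rng(0)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, -2.0]) + 0.5 * rng.standard_t(df=3, size=n)

# Apply the transformation y -> Xc + d*y with d > 0 and recompute everything.
c, d = np.array([3.0, 0.7]), 4.2
beta_hat, sigma_hat, A = studentized_residuals(X, y)
beta_new, sigma_new, A_new = studentized_residuals(X, X @ c + d * y)
max_diff = float(np.max(np.abs(A - A_new)))
```

With a negative d the residual vector changes sign, which is immaterial when the error density is symmetric about 0.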

2.2 Unconditional inference: a bootstrap approach 

Under the regression-scale model, the distribution $G_T$ of $T = (\hat\beta - \beta)/\hat\sigma$ does
not depend on $(\beta, \sigma)$ and provides a basis for unconditional inference when
$f_0$ is known. The same applies to the distribution $G_U$ of $U = \hat\beta - \beta$ under
the regression model. Suppose now $f_0$, and hence $f$, is unspecified except for
symmetry about 0. Under the regression-scale model, we may estimate $G_T$ by
the residual bootstrap method as follows. Let $F_n$ be the empirical distribution
of the $2n$ residuals $\pm A_1, \ldots, \pm A_n$. For a random sample $e^* = (e_1^*, \ldots, e_n^*)^T$
drawn from $F_n$, construct a bootstrap resample $Y^* = X\hat\beta + \hat\sigma e^*$ and calculate
$\hat\beta^* = \hat\beta(Y^*)$ and $\hat\sigma^* = \hat\sigma(Y^*)$. The distribution $G_T$ is then estimated by the
bootstrap distribution, $\hat G_T$ say, of $(\hat\beta^* - \hat\beta)/\hat\sigma^*$. Under the regression model,
we replace $A$ by $\tilde A$, calculate $\hat\beta^*$ from the bootstrap resample $Y^* = X\hat\beta + e^*$
and estimate $G_U$ by the bootstrap distribution $\hat G_U$ of $\hat\beta^* - \hat\beta$.
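
The residual bootstrap steps for the regression-scale model can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: least squares and the root mean squared residual are used for the two estimators, and the function names, the number of resamples and the simulated data are hypothetical.

```python
import numpy as np

def ls_fit(X, y):
    """Least squares coefficients and root mean squared residual."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    raw = y - X @ beta_hat
    return beta_hat, np.sqrt(np.mean(raw ** 2))

def residual_bootstrap_T(X, y, n_boot=2000, seed=0):
    """Bootstrap draws of T* = (beta* - beta_hat) / sigma* from the 2n symmetrized residuals."""
    rng = np.random.default_rng(seed)
    beta_hat, sigma_hat = ls_fit(X, y)
    A = (y - X @ beta_hat) / sigma_hat
    pool = np.concatenate([A, -A])            # the 2n residuals +/- A_i
    n = len(y)
    T_star = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        e_star = rng.choice(pool, size=n, replace=True)
        y_star = X @ beta_hat + sigma_hat * e_star
        b_star, s_star = ls_fit(X, y_star)
        T_star[b] = (b_star - beta_hat) / s_star
    return beta_hat, sigma_hat, T_star

# Illustrative use on a location model (X is a column of ones).
rng = np.random.default_rng(1)
y = 2.0 + rng.standard_t(df=5, size=15)
X = np.ones((15, 1))
beta_hat, sigma_hat, T_star = residual_bootstrap_T(X, y, n_boot=500)
q_lo, q_hi = np.quantile(T_star[:, 0], [0.05, 0.95])
# T = (beta_hat - beta) / sigma_hat, so quantiles of T invert to an interval for beta.
interval = (beta_hat[0] - sigma_hat * q_hi, beta_hat[0] - sigma_hat * q_lo)
```

The resulting interval is unconditional: nothing in the resampling holds the observed residual configuration fixed.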

2.3 Conditional inference: a plug-in approach 

Conditional inference about $\beta$ replaces $G_T$ and $G_U$ used in the unconditional
approach by, respectively, the conditional distributions $G_{T|A}(\cdot|a)$ of $T$ given
$A = a = (a_1, \ldots, a_n)^T$ and $G_{U|\tilde A}(\cdot|\tilde a)$ of $U$ given $\tilde A = \tilde a = (\tilde a_1, \ldots, \tilde a_n)^T$.

Consider first the regression-scale model. Define $S = \hat\sigma/\sigma$. The conditional
joint density of $(S, T)$ given $A = a$ has the expression

$$\kappa(s, t|a) = c_1(a)\, s^{n-1} \prod_{i=1}^{n} f_0\big(s(a_i + x_i^T t)\big), \quad s > 0 \text{ and } t \in \mathbb{R}^p, \qquad (1)$$

where $c_1(a)$ is a normalizing constant depending on $a$. Denote by $g_{T|A}(\cdot|a)$
the conditional density of $T$ given $A = a$. Then, for $t \in \mathbb{R}^p$ and $\mathcal{T} \subset \mathbb{R}^p$, the
integrals $g_{T|A}(t|a) = \int_0^\infty \kappa(s, t|a)\, ds$ and $G_{T|A}(\mathcal{T}|a) = \int_{t \in \mathcal{T}} g_{T|A}(t|a)\, dt$ can be
approximated by either Monte Carlo or numerical integration if $f_0$ is known,
with increasing computational cost as $p$ increases. When $f_0$ is unspecified, we
note I-sufficiency of $A$ for $f_0$ and propose estimating $f_0$ by a kernel density
estimate based on $a$: $\hat f_h(z|a) = (nh)^{-1} \sum_{i=1}^{n} k((z - a_i)/h)$, where $k$ is a kernel
function and $h > 0$ is the bandwidth. This leads to nonparametric estimates
$\hat G_{T|A}$ and $\hat g_{T|A}$ of $G_{T|A}$ and $g_{T|A}$ respectively, which can again be approximated
by either Monte Carlo or numerical integration methods. We term this the
"plug-in" (PI) approach to distinguish it from the "residual bootstrap" (RB)
approach introduced earlier to unconditional inference. The use of studentized
residuals $a$ in its derivation guarantees that $\hat f_h(z|a)$ has unit scale asymptotically.
Under symmetry of $f_0$, it might be beneficial in practice to use in place
of $\hat f_h$ its symmetrized version, $\bar f_h(z|a) = (\hat f_h(z|a) + \hat f_h(-z|a))/2$.
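
The kernel estimate and its symmetrized version are straightforward to code. A minimal sketch follows, assuming a standard normal kernel (the choice used in the empirical studies, although it does not have the compact support required by the theory of Section 3); the function names and the toy residual vector are illustrative.

```python
import numpy as np

def gaussian_kernel(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def f_hat(z, a, h, kernel=gaussian_kernel):
    """Kernel density estimate (nh)^{-1} sum_i k((z - a_i)/h), evaluated at each z."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    a = np.asarray(a, dtype=float)
    return kernel((z[:, None] - a[None, :]) / h).sum(axis=1) / (len(a) * h)

def f_sym(z, a, h, kernel=gaussian_kernel):
    """Symmetrized estimate: average of f_hat at z and at -z."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    return 0.5 * (f_hat(z, a, h, kernel) + f_hat(-z, a, h, kernel))

# Toy studentized residuals (hypothetical values) and an evaluation grid.
a = np.array([-1.3, -0.4, 0.1, 0.9, 1.6])
z = np.linspace(-10.0, 10.0, 4001)
dz = z[1] - z[0]
total_mass = float(f_hat(z, a, 0.5).sum() * dz)   # Riemann sum of the estimate
```

Either function can be passed wherever the known density would appear in (1), which is all the plug-in step requires.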

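Under the regression model, the distribution and density of $U$ conditional
on $\tilde A = \tilde a$ are given, for $\mathcal{U} \subset \mathbb{R}^p$ and $u \in \mathbb{R}^p$, by $G_{U|\tilde A}(\mathcal{U}|\tilde a) = \int_{u \in \mathcal{U}} g_{U|\tilde A}(u|\tilde a)\, du$
and $g_{U|\tilde A}(u|\tilde a) = c_2(\tilde a) \prod_{i=1}^{n} f(\tilde a_i + x_i^T u)$ respectively, for some constant $c_2(\tilde a)$.
If $f$ is unspecified, the PI approach substitutes $f$ by $\hat f_h(\cdot|\tilde a)$ or $\bar f_h(\cdot|\tilde a)$ to yield
plug-in estimates $\hat G_{U|\tilde A}$ and $\hat g_{U|\tilde A}$, on which conditional inference can be based.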
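
For the location model (p = 1 and every covariate equal to 1) the plug-in conditional density of U can be evaluated on a grid and inverted into a confidence interval. The sketch below is one way to do this under stated assumptions: the sample mean as the estimator, a Gaussian kernel, simple Riemann-sum integration, and a grid half-width of four sample standard deviations. The function names, the grid settings and the bandwidth used in the example are hypothetical choices, not the authors'.

```python
import numpy as np

def kde(z, a, h):
    """Gaussian-kernel density estimate built from a, evaluated at every entry of z."""
    u = (np.asarray(z, dtype=float)[..., None] - a) / h
    return np.exp(-0.5 * u ** 2).sum(axis=-1) / (len(a) * h * np.sqrt(2.0 * np.pi))

def pi_conditional_density(a_tilde, h, grid):
    """Plug-in estimate of g(u | a_tilde), proportional to prod_i f_hat(a_i + u), on a grid."""
    vals = kde(a_tilde[None, :] + grid[:, None], a_tilde, h)  # vals[g, i] = f_hat(a_i + u_g)
    log_g = np.log(vals).sum(axis=1)
    g = np.exp(log_g - log_g.max())                 # stabilise before normalising
    return g / (g.sum() * (grid[1] - grid[0]))

def pi_interval(y, h, level=0.95, n_grid=2001):
    """Equal-tailed plug-in interval for the location parameter."""
    beta_hat = y.mean()
    a_tilde = y - beta_hat
    half_width = 4.0 * y.std()
    grid = np.linspace(-half_width, half_width, n_grid)
    g = pi_conditional_density(a_tilde, h, grid)
    cdf = np.cumsum(g) * (grid[1] - grid[0])
    lo_idx = min(int(np.searchsorted(cdf, (1 - level) / 2)), n_grid - 1)
    hi_idx = min(int(np.searchsorted(cdf, 1 - (1 - level) / 2)), n_grid - 1)
    # U = beta_hat - beta, so a quantile q of U corresponds to beta = beta_hat - q.
    return beta_hat - grid[hi_idx], beta_hat - grid[lo_idx]

# Illustrative data and a normal-reference style bandwidth.
rng = np.random.default_rng(2)
y = 1.0 + rng.standard_t(df=5, size=15)
h = 1.06 * y.std() * len(y) ** (-0.2)
lo, hi = pi_interval(y, h)
```

For p > 1 the grid becomes impractical, and Monte Carlo sampling from the plug-in density takes its place.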



3 Theory 

We consider first the asymptotic behaviour of $G_{T|A}$ and $G_{U|\tilde A}$, and then assess
the PI approach by substituting kernel estimates for $f$ and $f_0$. Take $\ell_0 = \log f_0$
and $\ell = \log f$ and assume the following regularity conditions.

(D1) $f_0$ is symmetric about 0 and positive on $[-C, C]$ for some $C > 0$.

(D2) $f_0$ has uniformly bounded continuous derivatives up to order 3, with $f_0''$
being Lipschitz continuous.

(D3) Ee\ EeH'f^{ef, EeH'^{ef and E \eH'^'{e)\ are finite for e ^ U 

We assume that $X = X_n = [x_{n,1}, \ldots, x_{n,n}]^T$ depends on $n$ and satisfies the
following.

(C1) $X_n^T X_n$ is positive definite for all $n$, and $\Sigma = \lim_{n\to\infty} n^{-1} X_n^T X_n$ exists
and is positive definite.

(C2) (Generalized Noether condition) $\lim_{n\to\infty} \max_{1 \le i \le n} x_{n,i}^T (X_n^T X_n)^{-1} x_{n,i} = 0$.

(C3) sup„ {rr^ YTi=\{^n,i^ Xn,i)^^'^^ < oo for some 77 > 0. 

Note that (C1) and (C2) imply asymptotic normality of least squares estimators
of $\beta$: see Sen and Singer ((1993), Section 7.2). The location model
provides a trivial example that satisfies (C1)-(C3). The following theorem
derives the asymptotic conditional distributions of $n^{1/2}T$ and $n^{1/2}U$.

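Theorem 1 Assume (C1)-(C3), (D1)-(D3) and that $\hat\beta = \beta + O_p(n^{-1/2})$ and
$\hat\sigma = \sigma + O_p(n^{-1/2})$. Then

(i) under the regression-scale model, $\mathcal{I}^{1/2}(n^{1/2}T - \mathcal{I}^{-1}\theta)$ is standard normal
conditional on $A$, up to order $O_p(n^{-1/2})$, where $\mathcal{I} = n^{-2} X_n^T X_n \sum_{i=1}^{n} \ell_0'(A_i)^2$
and $\theta = n^{-1/2} \sum_{i=1}^{n} x_{n,i}\, \ell_0'(A_i)$;

(ii) under the regression model, $\tilde{\mathcal{I}}^{1/2}(n^{1/2}U - \tilde{\mathcal{I}}^{-1}\tilde\theta)$ is standard normal
conditional on $\tilde A$, up to order $O_p(n^{-1/2})$, where $\tilde{\mathcal{I}} = n^{-2} X_n^T X_n \sum_{i=1}^{n} \ell'(\tilde A_i)^2$
and $\tilde\theta = n^{-1/2} \sum_{i=1}^{n} x_{n,i}\, \ell'(\tilde A_i)$.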

We see from Theorem 1 that the conditional distributions of $n^{1/2}T$ and $n^{1/2}U$
admit normal approximations with conditional means and covariance matrices
depending on the score functions $\ell_0'$ and $\ell'$. The proof of Theorem 1 suggests
that the conditional covariance matrices $\mathcal{I}^{-1}$ and $\tilde{\mathcal{I}}^{-1}$ equal, up to order
$O_p(n^{-1/2})$, the deterministic matrices $I^{-1}$ and $\tilde I^{-1}$, where $I = n^{-1} X_n^T X_n \int (\ell_0')^2 f_0$
and $\tilde I = n^{-1} X_n^T X_n \int (\ell')^2 f$, whereas the conditional means $\mathcal{I}^{-1}\theta$ and $\tilde{\mathcal{I}}^{-1}\tilde\theta$ have
asymptotic unconditional distributions $N(0, I^{-1})$ and $N(0, \tilde I^{-1})$, respectively.
It follows that exact unconditional inference about $\beta$ may not be correct, not
even to first order asymptotically, conditional on the ancillary residuals. For
example, an unconditionally exact level $1 - \alpha$ confidence set derived from
$G_T$ has conditional coverage converging in probability to the random limit
$\Phi_{I^{-1}}(\mathcal{Q}_{1-\alpha} - Z)$ for $Z \sim N(0, I^{-1})$, where $\Phi_\Lambda$ denotes the $p$-variate $N(0, \Lambda)$
distribution and $\Phi_K(\mathcal{Q}_{1-\alpha}) = 1 - \alpha$ for some covariance matrix $K$. The only
exception is when $\hat\beta$ is the exact maximum likelihood estimator of $\beta$.
To validate the PI approach asymptotically, we assume that the kernel
function $k$ satisfies the following.

(K1) $k$ has support $[-c, c]$, for some $c > 0$, and is symmetric about 0.

(K2) $k$ is twice differentiable with $k''$ being Lipschitz continuous.

(K3) there exists some $q > 2$ such that $\int k = 1$, $\int u^j k(u)\, du = 0$ for $j =
1, \ldots, q - 1$, and $\int u^q k(u)\, du \ne 0$.

First-order approximation of the PI approach amounts to substitution of $\hat f_h(\cdot|A)$
or $\bar f_h(\cdot|A)$ for $f_0$ in the score $\ell_0'$ that defines the conditional normal mean
$\mathcal{I}^{-1}\theta$ and covariance matrix $\mathcal{I}^{-1}$ of $n^{1/2}T$. We consider a slightly different
score estimator in the theoretical development below. This simplifies the
proof and is asymptotically equivalent to the original PI proposal. Denote
by $A_{-i}$ the ancillary statistic $A$ with $A_i$ excluded, for $i = 1, \ldots, n$. Define,
for $i = 1, \ldots, n$ and $m = 0, 1, \ldots$, the "leave-one-out" kernel estimator of
$f_0^{(m)}$: $\hat f_h^{(m)}(z|A_{-i}) = ((n - 1)h^{m+1})^{-1} \sum_{j \ne i} k^{(m)}((z - A_j)/h)$. Symmetry of
$f_0$ motivates an anti-symmetrized leave-one-out estimate of $\ell_0'(A_i)$ given by

$$\hat\ell'_{h_0,h_1}(A_i|A_{-i}) = \frac{1}{2}\left[ \frac{\hat f^{(1)}_{h_1}(A_i|A_{-i})}{\hat f_{h_0}(A_i|A_{-i})} - \frac{\hat f^{(1)}_{h_1}(-A_i|A_{-i})}{\hat f_{h_0}(-A_i|A_{-i})} \right],$$

for bandwidths $h_0, h_1 > 0$. This leads to estimators $\hat\theta$ and $\hat{\mathcal{I}}$ of $\theta$ and $\mathcal{I}$,
respectively, obtained by replacing the score $\ell_0'(A_i)$ in their definitions by the
leave-one-out estimate. Similar steps lead to estimates $\hat{\tilde\theta}$ and $\hat{\tilde{\mathcal{I}}}$ of $\tilde\theta$ and $\tilde{\mathcal{I}}$, respectively,
under the regression model. The following theorem concerns consistency of
the above estimators.

Theorem 2 Assume (K1)-(K3), the conditions in Theorem 1 and that $h_m \to 0$
and nhf^~^^ — > oo, m = 0, 1. Then 

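(i) under the regression-scale model, $\hat{\mathcal{I}} = \mathcal{I} + O_p(\delta_1) = \mathcal{I} + o_p(1)$ and
$\hat\theta = \theta + O_p(\delta_2) = \theta + o_p(1)$;

(ii) under the regression model, $\hat{\tilde{\mathcal{I}}} = \tilde{\mathcal{I}} + O_p(\delta_1) = \tilde{\mathcal{I}} + o_p(1)$ and
$\hat{\tilde\theta} = \tilde\theta + O_p(\delta_2) = \tilde\theta + o_p(1)$,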

where 61 = hl + h\+n-^l^{h^^'^ + hf'^) and 62 = hl + h\+n-^'^{h'^'"^ + h-'''^). 

Theorems 1 and 2 together justify the PI approach asymptotically and derive
the valid orders of the bandwidths involved. Note that the conditional distributions
of $n^{1/2}T$ and $n^{1/2}U$ can be estimated consistently by $N(\hat{\mathcal{I}}^{-1}\hat\theta, \hat{\mathcal{I}}^{-1})$
and $N(\hat{\tilde{\mathcal{I}}}^{-1}\hat{\tilde\theta}, \hat{\tilde{\mathcal{I}}}^{-1})$, respectively, provided that $h_0$ and $h_1$ satisfy the bandwidth
conditions of Theorem 2.
We term this the "normal approximate plug-in" (NPI) approach to distinguish
it from the PI approach which directly simulates from, or numerically evaluates,
$\hat G_{T|A}$ and $\hat G_{U|\tilde A}$. The normal approximation error can be kept to a minimum
of order $O_p(n^{-q/(5+2q)})$ by setting $h_1 \propto n^{-1/(5+2q)}$, $h_0 = O(n^{-1/(5+2q)})$
and /iq ^ = 0(n^/(^^+^''^). Park (1993) introduced trimming constants to the 
estimated score $\hat\ell'_{h_0,h_1}$ to correct for its occasional erratic behaviour.
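
The NPI recipe can be sketched for the location model without scale (p = 1, every covariate equal to 1). The code assumes the forms theta = n^{-1/2} sum_i x_i * score_i and I = n^{-2} X'X sum_i score_i^2 as we read them from Theorem 1, the anti-symmetrized leave-one-out score with a factor of one half, a Gaussian kernel (which does not satisfy (K1)), and no trimming. All function names and the bandwidth values are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def loo_score(A, h0, h1):
    """Anti-symmetrized leave-one-out estimate of the score at each residual A_i."""
    A = np.asarray(A, dtype=float)
    n = len(A)

    def loo(z, h, m):
        # u[i, j] = (z_i - A_j) / h; the diagonal term j = i is left out.
        u = (z[:, None] - A[None, :]) / h
        k = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
        if m == 1:
            k = -u * k                      # derivative of the Gaussian kernel
        np.fill_diagonal(k, 0.0)
        return k.sum(axis=1) / ((n - 1) * h ** (m + 1))

    plus = loo(A, h1, 1) / loo(A, h0, 0)
    minus = loo(-A, h1, 1) / loo(-A, h0, 0)
    return 0.5 * (plus - minus)

def npi_interval(y, h0, h1, level=0.95):
    """Normal approximate plug-in interval for a location parameter (sample mean estimator)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    beta_hat = y.mean()
    score = loo_score(y - beta_hat, h0, h1)
    theta = score.sum() / np.sqrt(n)        # x_i = 1 for all i
    info = np.mean(score ** 2)              # n^{-2} * (X'X = n) * sum(score^2)
    centre, sd = theta / info, info ** -0.5
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    # n^{1/2} U is approximated by N(centre, sd^2) with U = beta_hat - beta.
    return (beta_hat - (centre + z * sd) / np.sqrt(n),
            beta_hat - (centre - z * sd) / np.sqrt(n))

# Illustrative data, symmetric about 3, with arbitrary bandwidths.
y = 3.0 + np.array([-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0])
lo, hi = npi_interval(y, h0=0.8, h1=0.8)
```

When the residuals are exactly symmetric about 0, as in this toy sample, the estimated conditional mean vanishes and the interval is centred at the sample mean; asymmetric configurations shift it.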
Remark. Adaptive estimation constructs asymptotically efficient estimators
by substituting nonparametric score estimates in a one-step maximum likeli-
hood approximation. Stone (1975) considered adaptive estimation under the
symmetric location model. Bickel (1982) extended the construction to linear
models. Under our regression setup, the adaptive estimator built upon an 
equivariant regression estimator has conditional and unconditional distribu- 
tions equivalent to first order, and can be viewed as an equivariant regression 
estimator with conditional mean recentered at the true regression parameter. 
This connection implies asymptotic equivalence of adaptive estimation and the 
NPI approach, suggesting that the latter can be approximated by uncondi- 
tional inference based on adaptive estimators. Many nonparametric methods, 
such as the bootstrap, that are intended mainly for unconditional inference 
are readily available for estimation of such unconditional distributions. 



4 Robustness and configural polysampling 

Morgenthaler and Tukey (1991) suggest a global, finite-sample, notion of ro- 
bustness that pays due attention to ancillarity. Their method, known as con- 
figural polysampling, makes robust inference by conditioning on an ancillary 
configuration of the observed data under a confrontation of rival parametric 
models. Its dissociation from asymptotic reasoning makes the method
attractive for finite samples and distinct from such conventional devices as the
influence function and the breakdown point. Morgenthaler (1993) specializes
it to linear models and develops computationally simple procedures for robust 
estimation. 

A key ingredient to configural polysampling is the choice of a confrontation
pair $(\mathcal{F}, \mathcal{G})$, where $\mathcal{F}$ and $\mathcal{G}$ denote extremes, in a spectrum of error
distributions of practical interest, under which inference is done separately
and the resulting analyses combined in an optimal way. The approach can
be generalized to deal with more than two distributions in the confrontation.
Morgenthaler and Tukey (1991) suggest taking $\mathcal{F}$ and $\mathcal{G}$ to be the
normal and slash distributions to encompass a spectrum ranging from light-
to heavy-tailed distributions. To fix ideas, consider estimation of $\beta$ by an
equivariant estimator $V$, such that $V(Y) = \hat\beta + \hat\sigma V(A)$. When $f_0 = \mathcal{F}$,
the conditional mean squared error (cMSE) of $V$ given $A$ is minimized at
$V(A) = V_{\mathcal{F}}(A) = -E_{\mathcal{F}}[S^2 T|A]/E_{\mathcal{F}}[S^2|A]$, leading to Pitman's (1939) famous
estimator, an early example of optimal estimation driven by the conditionality
principle. Thus we may write, for an arbitrary equivariant estimator $V$,
$\mathrm{cMSE}_{\mathcal{F}}(V|A) = \mathrm{cMSE}_{\mathcal{F}}(V_{\mathcal{F}}|A) + \sigma^2 E_{\mathcal{F}}[S^2|A](V(A) - V_{\mathcal{F}}(A))^2$. Morgenthaler
and Tukey (1991) select a "bioptimal" $V$ by minimizing $P_{\mathcal{F}} \times \mathrm{cMSE}_{\mathcal{F}}(V|A) +
P_{\mathcal{G}} \times \mathrm{cMSE}_{\mathcal{G}}(V|A)$, for a pair of shadow prices $P_{\mathcal{F}}$ and $P_{\mathcal{G}}$. Alternatively, a
minimax estimator $V$ of $\beta$ can be obtained by minimizing the maximum of
$\mathrm{cMSE}_{\mathcal{F}}(V|A)$ and $\mathrm{cMSE}_{\mathcal{G}}(V|A)$, which often amounts to solving the equation
$\mathrm{cMSE}_{\mathcal{F}}(V|A) = \mathrm{cMSE}_{\mathcal{G}}(V|A)$. The regression model can be treated similarly.
In confidence interval problems one may, for example, minimize the conditional 
mean interval length subject to correct unconditional coverages under $\mathcal{F}$ and $\mathcal{G}$.
In general, configural polysampling fine-tunes statistical procedures to achieve
simultaneous efficiency over a spectrum of distributions determined by $(\mathcal{F}, \mathcal{G})$.
It can be generalized with data-driven choices of $(\mathcal{F}, \mathcal{G})$, thereby robustifying
the inference procedure in a global sense. We envisage confrontations $(\mathcal{F}, \mathcal{G})$
which reflect practical concerns in robust statistical inference. For example, we
may confront parametric with nonparametric approaches, unconditional with
conditional approaches, asymptotic approximation with finite-sample meth- 
ods, small with large bandwidths in any kernel-based approach, or any two 
competing nonparametric approaches. In these possible confrontations, our 
PI or NPI approaches can play a prominent role in robustifying the inference 
outcome specific to the observed ancillary configuration. Further empirical 
evidence is presented in Section 5.2. 
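
For a scalar location problem the minimax step reduces to minimizing the maximum of two quadratics in V, because each conditional mean squared error has the form c + m (V - V_opt)^2 with m > 0. The sketch below assumes the three constants for each member of the confrontation have already been obtained (for example by Monte Carlo or by the plug-in approach); the function name and the numerical inputs are hypothetical.

```python
import numpy as np

def minimax_location(c_F, m_F, v_F, c_G, m_G, v_G):
    """Minimize max{c_F + m_F (v - v_F)^2, c_G + m_G (v - v_G)^2} over v.

    The minimizer is either one of the two vertices or a point between them
    where the two quadratics cross.
    """
    def cmse_F(v):
        return c_F + m_F * (v - v_F) ** 2

    def cmse_G(v):
        return c_G + m_G * (v - v_G) ** 2

    lo, hi = min(v_F, v_G), max(v_F, v_G)
    # Coefficients of cmse_F(v) - cmse_G(v) as a polynomial in v.
    coeffs = [m_F - m_G,
              -2.0 * (m_F * v_F - m_G * v_G),
              c_F + m_F * v_F ** 2 - c_G - m_G * v_G ** 2]
    roots = np.roots(coeffs)
    crossings = [float(r.real) for r in roots
                 if abs(r.imag) < 1e-12 and lo <= r.real <= hi]
    candidates = [v_F, v_G] + crossings
    worst = [max(cmse_F(v), cmse_G(v)) for v in candidates]
    best = int(np.argmin(worst))
    return candidates[best], worst[best]

# Two equally shaped quadratics centred at 0 and 2: the minimax point is halfway.
v_star, value = minimax_location(1.0, 1.0, 0.0, 1.0, 1.0, 2.0)
```

When one quadratic dominates the other at its own vertex, no crossing is needed and that vertex is returned, which corresponds to the case where the equation of equal cMSE has no relevant solution.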

5 Empirical studies



5.1 Confidence intervals 

Our first study compared the conditional coverage probabilities of the PI and 
NPI intervals with those of the exact unconditional and RB intervals. We 
considered the location model with $p = 1$ and $\beta = 1$, and took $f$ to be a
Student's $t$ density, which satisfies (D1)-(D3). The "conditional" samples,
all subject to a common observed value of $A$, were obtained by rejection
sampling. Three sample sizes, $n = 15$, 30 and 100, were considered. The nominal
level $1 - \alpha$ was chosen to be 0.90, 0.91, . . . , 0.99. Each conditional coverage
was estimated from 5000 "conditional" samples. Construction of the PI and
NPI intervals was based on 5000 samples drawn from $\hat G_{U|\tilde A}$ and its normal
approximation, respectively. The RB interval was based on 1000 bootstrap
samples and the exact unconditional interval on 5000 samples drawn from $f$
itself. The kernel function $k$ was taken to be the standard normal density.

The objective of this study is to demonstrate the importance of conditioning 
and the effectiveness of PI and NPI in constructing conditional confidence 
intervals. Despite its importance in practice, the issue of bandwidth selection 
is not our main interest and we set $h = 1$ throughout the study, the best choice
in a pilot study done on four different sets of ancillary residuals. Conventional 
methods for practical bandwidth selection include the normal referencing rule, 
cross-validation and the (conditional) bootstrap. Alternatively, an innovative
approach can be based on configural polysampling under a confrontation of
two extreme choices of bandwidth. This will be illustrated in Section 5.2.
For the NPI approach we used the true $f$, rather than its kernel estimate,
for computing $\ell'$ in order to examine the effects on conditional coverages due
exclusively to normal approximation. 

Figure 1 plots the conditional coverage errors against $1 - \alpha$ for $n = 15$ for
four different sets of A, chosen specifically such that the exact unconditional 
intervals undercover in two cases and overcover in the other two. We see 
that the exact unconditional interval has very large conditional coverage error 
compared to the two plug-in approaches, except for the fourth case where 
it outperforms the NPI approach. Surprisingly, the RB interval yields more 
accurate coverage than does the exact unconditional interval, although the 
former is designed primarily for estimating the latter. It is evident that U has 
very different unconditional and conditional distributions given our choices of 
A. The PI approach works effectively for all four choices of A. Inferior in 
general to the PI intervals, NPI nevertheless corrects the exact unconditional 
interval to some extent, although the correction is less remarkable when the 
unconditional interval overcovers. Similar conclusions are observed for n = 30 
and 100. We also investigated choices of A given which the exact unconditional 
interval is conditionally accurate. The results, not shown in this report, suggest 
that both the PI and NPI intervals remain, as expected, accurate in those cases.

5.2 Robust conditional estimation

The second study illustrates applications of the PI approach in configural 
polysampling procedures for robust conditional inference. We considered three 
types of confrontation pairs, all reflecting genuine practical concerns: (i) the 
normal versus the slash distributions; (ii) the least squares method versus the 
PI approach based on bandwidth $h = Cn^{-1/5}$, a multiple of the optimal order;
and (iii) the PI approach based on contrasting bandwidths $h_a$ and $h_b$. Note
that (i) was conceived by Morgenthaler and Tukey (1991) for achieving ro- 
bustness across symmetric, unimodal, distributions of different tail behaviour. 
Case (ii) contrasts conditional with unconditional inferences. Case (iii) sug- 
gests a practical robust solution, which respects the conditionality principle, 
to the problem of bandwidth selection in the PI method. In the study we set 
$C = 0.1, 0.5, 1.0, 1.5, 2.0, 2.5$ and $h_a = 0.1$, $h_b = 2.0$. The kernel $k$ was taken to
be the standard normal density. 

We considered again a location model and compared the mean squared 
error of minimax location estimates obtained under different confrontations. 
The least squares estimate, the sample mean, was also included for compari- 
son. Given a fixed set of residuals, we generated 100,000 "conditional" random 
samples of sizes $n = 15$ and 30 from each of six different distributions: the
Student's $t_1$, the normal mixture $\frac{1}{2}N(-3, 1) + \frac{1}{2}N(3, 1)$ and the centered beta
distributions with support $[-5, 5]$ and shape parameters (1/2, 1/2), (2, 2), (1/2, 2)
and (2, 1/2), among which the $t_1$ and beta(2, 2) densities have bell-like shapes and
can be deemed to lie within the normal-slash spectrum. We are here not so 
much concerned with asymptotic validity as interested in robustness against 
model departures in a broad context. Indeed, all six distributions except the 
normal mixture fail to satisfy (D3). 

Table 1 reports the cMSE's of the various estimates, obtained by averaging
over the conditional samples generated from each distribution. We see that 
confrontation types (ii) and (iii) give remarkably small cMSE compared to (i), 
which is even less accurate than the unconditional least squares estimate un- 
der distributions outside the normal-slash spectrum. Confrontation type (ii) 
outperforms (i) under all choices of $C$ and most of the underlying distributions
except $t_1$, under which use of large $C$ in (ii) gives results comparable to
(i). Particularly encouraging are the results obtained using confrontation (iii), 
which returns an accurate, robustified PI estimate for which the bandwidth is 
implicitly selected from candidate values lying between $h_a$ and $h_b$.

5.3 A real data example

DiCiccio (1988) and Sprott (1980, 1982) made conditional inference about a
real location parameter $\beta$ by fitting a location-scale model with Student's $t$ error to
Darwin's data (Fisher (1960), p. 37) on 15 height differences between cross-
and self-fertilized plants. We removed the $t$ assumption, set $\hat\beta$ to be (I) the
sample mean and (II) the sample median, both being location equivariant,
and constructed 95% two-sided RB, PI and NPI intervals for $\beta$ in both cases.
The RB interval was based on 50,000 bootstrap samples. The NPI interval
was built on the anti-symmetrized leave-one-out score estimate, for which the
bandwidths $h_0$ and $h_1$ were fixed
using the normal referencing rule.

For (I), we calculated the RB and NPI intervals to be (2.46,39.43) and 
(8.78, 40.80), respectively, and the PI intervals to be (17.51, 24.45), (11.33, 35.84), 
(10.42, 39.12), (8.38, 42.15), (4.86, 44.44) and (2.24, 45.51) based on bandwidths 
$h = mh_0$, for $m = 0.2, 0.5, 0.7, 1.0, 1.3, 1.5$, respectively. The results are in
agreement with DiCiccio's (1988) and Sprott's (1980, 1982) findings, suggest- 
ing plausibility of their Student's t error assumption. The case (II) gives similar 
results except that the endpoints are shifted slightly to the right, in general.

6 Conclusion

We establish consistency of the PI approach to conditional inference and derive 
sufficient bandwidth orders. The NPI approach provides a computationally 
convenient normal approximation to it. Effectiveness of the approaches is 
confirmed by empirical findings. The computational cost of PI depends on the 
dimension $p$ and the efficiency with which we can simulate from $\hat g_{T|A}$ or $\hat g_{U|\tilde A}$.
The computing times for both plug-in approaches were found to be within 
seconds under the location model considered in Section 5.1. 

Incorporation of the plug-in approaches into confrontations extends con- 
figural polysampling to the nonparametric realm, rendering the resulting con- 
ditional inference an extra dimension of robustness. When applied to a con- 
frontation of two extreme bandwidths, the technique suggests an innovative 
solution, which observes the conditionality principle, to bandwidth selection 
in practical applications of the PI approaches. We remark that confrontations 
of more than two specifications of error density can be considered in configural 
polysampling to further robustify the inference outcome, although then the 
minimax algorithm is necessarily more computationally involved. 

7 Appendix



7.1 Proof of Theorem 1

Under the regression-scale model, we deduce, by a Taylor expansion of (1) in
powers of $n^{-1/2}$, that the density of $n^{1/2}(\log S, T)$ conditional on $A$ is proportional,
up to $O_p(n^{-1/2})$, to the product of the $N(J^{-1}\psi, J^{-1})$ and $N(\mathcal{I}^{-1}\theta, \mathcal{I}^{-1})$
density functions, where $J = -n^{-1} \sum_{i=1}^{n} \{A_i \ell_0'(A_i) + A_i^2 \ell_0''(A_i)\}$ and
$\psi = n^{-1/2} \sum_{i=1}^{n} (A_i \ell_0'(A_i) + 1)$. This proves part (i). Part (ii) follows by similar,
but simpler arguments.
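
The expansion behind part (i) can be written out as follows. This is our own sketch of the calculation, not taken from the proof: the symbols $\lambda$, $\nu$, $\tau$, $\mathcal{I}^{*}$ and $c$ are introduced here for illustration, and the orders of the remainder and of the cross term are assumed rather than verified.

```latex
% Reparametrise (1) by \lambda = \log s, so that s^{n-1}\,ds = e^{n\lambda}\,d\lambda.
% The log density of (\lambda, t) given A = a is then
\text{const} + n\lambda + \sum_{i=1}^{n} \ell_0\!\big(e^{\lambda}(a_i + x_i^T t)\big).

% Put \lambda = n^{-1/2}\nu and t = n^{-1/2}\tau, and expand to second order about (0, 0):
\text{const} + \nu\,\psi + \tau^T\theta
  - \tfrac{1}{2} J\,\nu^2 - \tfrac{1}{2}\,\tau^T \mathcal{I}^{*}\,\tau
  + \nu\,\tau^T c + \text{remainder},

% where
\psi = n^{-1/2}\sum_{i=1}^{n} \{1 + a_i\,\ell_0'(a_i)\}, \qquad
\theta = n^{-1/2}\sum_{i=1}^{n} x_i\,\ell_0'(a_i),

J = -n^{-1}\sum_{i=1}^{n} \{a_i\,\ell_0'(a_i) + a_i^2\,\ell_0''(a_i)\}, \qquad
\mathcal{I}^{*} = -n^{-1}\sum_{i=1}^{n} x_i x_i^T\,\ell_0''(a_i),

c = n^{-1}\sum_{i=1}^{n} x_i\,\{\ell_0'(a_i) + a_i\,\ell_0''(a_i)\}.
```

If the cross term $c$ is negligible, the quadratic form separates into a normal kernel in $\nu$ with mean $J^{-1}\psi$ and variance $J^{-1}$ and a normal kernel in $\tau$ with mean $\mathcal{I}^{*-1}\theta$ and covariance $\mathcal{I}^{*-1}$. Under symmetry of $f_0$ the summand of $c$ is an odd function of $a_i$, so one would expect $c$ to be of smaller order. Here $\mathcal{I}^{*}$ is the observed-curvature analogue of the matrix appearing in Theorem 1, and agreement of the two to the stated order is assumed in this sketch.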

7.2 Proof of Theorem 2

Note that Linton and Xiao's (2001) Lemma 2 can be adapted to deduce that 
/^)(±A,|A_,) = ft\±A;) + 0,{h'^ + 1/2), ^ = 0, 1, (2) 

uniformly in $i \in \{1, \ldots, n\}$. It follows that $\hat\ell'_{h_0,h_1}(A_i|A_{-i}) = \ell_0'(A_i) + O_p(\delta_1)$,
and hence the result for $\hat{\mathcal{I}}$.

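Define $\delta^{\pm}_{m,i} = \pm\big[\hat f^{(m)}_{h_m}(\pm e_i|A_{-i}) - \hat f^{(m)}_{h_m}(\pm A_i|A_{-i})\big]$. That $\hat\beta$ and $\hat\sigma$ are $n^{1/2}$-consistent
implies that $\mu = \log(\hat\sigma/\sigma)$ and $r = \hat\beta/\sigma - \beta/\sigma$ are both $O_p(n^{-1/2})$.
Write $\bar x = \sum_{i=1}^{n} x_{n,i}/n$. Conditioning on $e_i$, standard asymptotic theory yields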

= ft'''\±e^){±x^, -x) + 0,{hl + n-'I'h'^^-'/'), 
= - (m + l)ft\±e.) + 0,{hl + n-'''h-^--''%

so that 

+ 0,{n-'l'hl + n-'h~^--'/'). (3) 
Noting (3), and that (2) also holds if $\pm A_i$ is replaced by $\pm e_i$, we have

+ [/o(eiX.(r + - /x/o(q)] /^(e,)//o(e,)2 + {n-'lH^) . (4) 

Expanding about e^, we have 

= /o(eO//o(eO- K.(r + /x/5/a) + /.e.]/aeO//o(eO 

+ K.(r + ^^PI<y) + /^e.] /^(e,)V/o(6,)' + 0,{n-'). (5) 

Symmetry of /o and (D3) together imply that n'^/'^YJi=i^n,if'o{,(^i)/fo{.(-i), 
^_i/2 ^n^^ x„,iei/o (ei)//o(ei), and n"^/^ X^"^^ Xn,ieifo{eiY / fo{ei)'^ are all of or- 
der Op(l). It then follows from (jlj) and that 

n 

^^t _^ = {/;:^(e.|A_,)/A„(6,|A_,) - /^(e,)//o(e,)} +0,(52). (6) 
The proof of Linton and Xiao's (2001) Theorem 1 can be adapted to show that
the first term in (6) has order $O_p(\delta_1)$, which can be absorbed into $O_p(\delta_2)$. This
completes the proof of (i). Part (ii) follows by similar arguments.

Figure 1: Conditional coverage errors of exact unconditional, PI, NPI and RB
intervals, for $n = 15$.

Table 1: Conditional mean squared errors of least squares estimates (LS) and
minimax estimates obtained under confrontations (i) normal vs slash, (ii) LS
vs PI, (iii) $h_a = 0.1$ vs $h_b = 2.0$ in PI.